(54) Document modification apparatus and image processing apparatus



(57) Automatic region extracting means (2) extracts
rectangle regions having the attributes "character",
"photograph", "table", "ruled line", and "frame" from
image data input through image input means (1), and stores
the information of the extracted rectangle regions in
modification information storage means (3). Display means
(4) displays the input image including the extracted
rectangle regions according to that information. The
operator selects desired extracted rectangle regions in the
input image on a display screen and specifies the kind of
modification for the selected rectangle regions by using
operation means (5). The information of both the selected
rectangle regions and the specified modifications is
thereby stored in the modification information storage
means (3). Modification image making means (6) then makes
modified image data based on the information of the
selected rectangle regions, the specified modification
information, and the input image data, and image output
means (7) outputs the modified image.
Description

CROSS-REFERENCE TO RELATED APPLICATION 

[0001] This application is based upon and claims the
benefit of priority from the prior Japanese Patent
Application No. 2000-012034, filed January 20, 2000; the
entire contents of which are incorporated herein by
reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

[0002] The present invention relates to a document 
modification apparatus and an image processing appa- 
ratus equipped with the document modification appara- 
tus, for modifying image data obtained by reading a 
manuscript such as a document. 

2. Description of the Related Art 

[0003] Conventional document modification apparatuses
have performed crosshatching, underlining, and
enhancing for characters and photo images in a target
document to be modified. There are many well-known types
of document modification apparatus, such as the tablet
digitizer type, the coordinate input type, the document
region reading type, and so on. Those conventional
document modification apparatuses have the following
drawbacks.

[0004] In the tablet digitizer type, an operator specifies
an optional region in a document placed on a digitizer
by using a pencil and also designates a modification
type for the designated region; after this, the operator
must put the document on the document table again.
Accordingly, there is a drawback that this type of
conventional document modification apparatus causes a
shift of the position to be modified.
[0005] In the coordinate input type, the operator must
predict in advance the coordinate of a target modification
region as observed from the standard point while the
document is set on the document table, and must then
input this coordinate through an operation section. This
has the drawback of taking more time.
[0006] In the document region reading type, the operator
must directly mark a region to be modified in a document
by using a marker pencil. Accordingly, this type of
conventional document modification apparatus has the
drawback of staining the document with ink.

[0007] In order to eliminate the conventional drawbacks
described above, a conventional pre-scan display
method has been proposed, in which an image input
means reads the document placed on a document table
and a display device then displays the image of the
document. The operator then specifies a region in the
document to be modified while watching the image on the
display device. In particular, a region specifying
method of extracting a target region from a document
and of specifying modification information for the
extracted region has been disclosed in a Japanese
patent document (Japanese laid-open publication No.
4-157876). In order to increase the precision of the
designation of the modification position, this conventional
technique designates the region by extracting a binary
image region and an intermediate gradation region in the
original document and by displaying the distribution
relationship between them. This also decreases the
operator's work during the designation process for the
modification position.

[0008] However, although the above conventional
technique of Japanese laid-open publication No.
4-157876 can handle, as a target to be modified, a
document including only characters and photographs, it
cannot separate characters and cut characters from a table
and a frame in a document, and it further cannot cut cells
from a table in a document. Thus, the conventional
techniques have the drawback of limiting the types of
documents that can be the target to be modified.



[0009] Accordingly, an object of the present invention
is, with due consideration to the drawbacks of the
conventional technique, to provide a document modification
apparatus and an image processing apparatus
equipped with the document modification apparatus
with high versatility, which are capable of reducing the
operator's work to handle a document including characters,
a photograph, a table, a ruled line, and a frame,
and capable of performing the modification process
efficiently.

[0010] In accordance with a preferred embodiment of
the present invention, a document modification apparatus
for modifying image data read by image input means
comprises region extracting means, region attribute
judgment means, region selection means, modification
specifying means, and modification image making
means. The region extracting means extracts rectangle
regions as the target regions to be modified from the
input image data. The region attribute judgment means
judges whether an attribute of each rectangle region is
one of at least two kinds of attributes, "character"
and "photograph". The region selection means
selects target regions to be modified from the plurality
of regions through an operator. The modification
specifying means specifies kinds of modifications for the
target regions selected by the region selection means
through the operator. The modification image making
means makes a modified image by applying the kinds of
modifications specified by the modification specifying
means to the regions in the image data selected by the
region selection means.
[0011] In the document modification means according
to the present invention described above, the region
attribute judgment means judges whether an attribute of
each rectangle region that has been extracted is one of
attributes such as "character", "photograph", "table",
"ruled line", and "frame". Each of the attributes that have
been set in advance is one of "character", "photograph",
"table", "ruled line", and "frame".
[0012] In the document modification means according
to the present invention described above, the region
extracting means integrates the rectangle regions whose
attribute has been judged as "character" by the region
attribute judgment means, per line and paragraph, and
the region selection means selects the target region to
be modified per line and paragraph through the operator.

[0013] In the document modification means according
to the present invention described above, the region ex- 
tracting means displays on a display screen the rectan- 
gle regions extracted by the region extracting means 
with the image data read by the image input means, and 
selects whether each rectangle region on the display 
screen is modified or not through the operator. 
[0014] In the document modification means according
to the present invention described above, the region
selection means moves the cursor to a rectangle region
in the input image and blinks the rectangle region
indicated by the cursor so that the operator selects whether
this rectangle region is to be modified. After the selection
for the rectangle region, the region selection
means moves the cursor to the following rectangle region.
These operations are repeated.
[0015] In the document modification means according
to the present invention described above, the modification
instruction means displays an at-a-glance menu
showing the information regarding the kinds of
modification, and selects the modification to be applied to
the selected rectangle regions from the kinds of
modifications shown in the at-a-glance menu through
the operator.

[0016] In the document modification means according
to the present invention described above, the modification
image making means comprises memory means for
storing position information of the rectangle regions
selected by the region selection means and the modification
information regarding the kinds of modifications
specified by the modification specifying means, and the
modification image making means performs the modification
on the image data read by the image input means
based on the position information and the modification
information stored in the memory means.

[0017] In the document modification means according 
to the present invention described above, the apparatus 
further comprises resolution conversion means for 
changing a resolution of the input image data to a re- 
duced image; and display means for displaying the re- 
duced image obtained by the resolution conversion 
means with the rectangle regions extracted by the re- 
gion extracting means. 




[0018] In accordance with another preferred embodiment
of the present invention, a document modification
apparatus for modifying image data read by image input
means comprises region extracting means, automatic
modification means, and modification image making
means. The region extracting means extracts a plurality
of regions from the image data, each region being a unit
to be modified. The automatic modification means
automatically selects target regions to be modified from the
plurality of regions, and automatically modifies the
selected target regions based on modifications that have
been set in advance. The modification image making
means makes a modified image in the target regions
selected by the automatic modification means
based on the kinds of modifications determined by
the automatic modification means.
[0019] In the document modification means according
to the present invention described above, the automatic
modification means determines the kind of modification
to be applied to each selected target region in
consideration of the attribute of the selected target
region and the position of the selected target region in the
input image data.

[0020] In the document modification means according
to the present invention described above, the region
extracting means comprises region attribute judgment
means for judging an attribute of each region, and the
attribute of each region to be judged by the region
attribute judgment means is one of the attributes "character",
"photograph", "table", "ruled line", and "frame".

[0021] In the document modification means according
to the present invention described above, the image in- 
put means converts the input image data to binary im- 
age data. 

[0022] In accordance with another preferred embodiment
of the present invention, an image processing
apparatus comprises image input means for reading image
data from a document, the document modification
apparatus of the present invention for making a modified
image by modifying the input image data obtained by
the image input means, and image output means for
outputting the modified image obtained by the document
modification apparatus.



[0023] These and other objects, features, aspects 
and advantages of the present invention will become 
more apparent from the following detailed description of 
the present invention when taken in conjunction with the
accompanying drawings, in which: 

FIG.1 is a block diagram showing an image
processing apparatus equipped with a document
modification apparatus according to a first
embodiment of the present invention;
FIG.2 is a block diagram showing an example of a
detailed configuration of an automatic region
extracting means shown in FIG.1;
FIG.3 is a block diagram showing a detailed
configuration of an operation means shown in FIG.1;
FIG.4 is a diagram showing an input image;
FIG.5 is a diagram showing an example of a binary
image that is converted from original image data;
FIG.6 is a diagram showing an example of an image
after a black pixel connecting process has been
completed;
FIG.7 is a diagram showing an example of a detected
outline image and a circumscribed rectangle image
after the outline has been detected;
FIG.8 is a flow chart showing a procedure of a
region attribute judgment process performed by the
automatic region extracting means;
FIG.9 is a diagram of a table showing judgment data
items to be used in judging whether an extracted
rectangle region has one of the attributes
character, ruled line, or other;

FIGS.10A, 10B, and 10C are diagrams showing
examples of images after a projection process has
been completed;
FIG.11 is a diagram of a table showing judgment
data items to be used in judging whether an
extracted rectangle region has one of the attributes
photograph, table, or frame;
FIG.12 is a diagram showing examples of extracted
rectangle regions that are extracted per attribute;
FIGS.13A, 13B, and 13C are diagrams showing
conditions for extracting lines in rectangle regions;
FIG.14 is a diagram showing an example of the
attributes of extracted regions;
FIGS.15A and 15B are diagrams showing conditions
for extracting paragraphs in a rectangle region;
FIGS.16A, 16B, and 16C are diagrams showing
conditions for extracting paragraphs in a rectangle
region;

FIG.17 is a diagram showing an example of a result
of a region extract operation performed by the
region extracting means;
FIG.18 is a diagram showing a display example on
a display means;
FIG.19 is a diagram showing another display
example on the display means;
FIG.20 is a diagram showing another display
example on the display means;
FIG.21 is a diagram showing another display
example on the display means;
FIG.22 is a flow chart showing an operator's
procedure for selecting modification regions and for
specifying modification contents;
FIG.23 is a block diagram showing an image
processing apparatus equipped with a document
modification apparatus according to a second
embodiment of the present invention; and
FIG.24 is a diagram showing an example of data in
a table set in an automatic modification means
shown in FIG.23.



DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

[0024] Other features of this invention will become
apparent through the following description of preferred
embodiments, which are given for illustration of the
invention and are not intended to be limiting thereof.

First embodiment 


[0025] FIG.1 is a block diagram showing the image 
processing apparatus equipped with the document 
modification apparatus according to the first
embodiment of the present invention.

[0026] The image processing apparatus comprises:
an image input means 1; an automatic region extracting
means 2; a modification information storage means 3;
a display means 4; an operation means 5; a modification
image making means 6; and an image output means 7.

[0027] The image input means 1 reads a target document
to be modified and inputs it. The automatic
region extracting means 2 (corresponding to both the
region extracting means and the region attribute judgment
means in the claims) extracts regions of each attribute,
such as a character, a photograph, a table, a ruled line,
a frame, and so on, from the target document that has
been read by the image input means 1. The modification
information storage means 3 stores position information
of the extracted regions and the kinds of modification
information to be applied to the extracted regions. The
display means 4 displays the input image of the document,
each extracted image that is extracted from the input
image of the document, an image designated by the
operator as a target of the modification, and the finally
modified image. Through the operation means 5, an
operator specifies a desired modification to the extracted
regions in the image displayed on the display means 4.
The modification image making means 6 makes a modified
image obtained by modifying the image of the input
document according to the designation of the operator.
The image output means 7 prints the modified image on
a print sheet.
[0028] FIG.2 is a block diagram showing an example
of a detailed configuration of the automatic region
extracting means shown in FIG.1. As shown in FIG.2, the
automatic region extracting means 2 comprises: a
binarization means 21; a black pixel connecting means 22;
an outline trace means 23; a rectangle information
storage means 24; a circumscribed rectangle integration
means 25; a judgment means 26 for character and ruled
line; a projection means 27; an extracting means 28 for
line and paragraph; a judgment means 29 for table,
photograph, and frame; an extracting means 30 for a cell,
a row, and a column; and an attribute region extracting
means 31.

[0029] The binarization means 21 converts the original
image data into binary image data. The black pixel
connecting means 22 connects binary black pixels (binary
black picture elements). The outline trace means
23 makes an outline image of the binary black pixel
block (binary black picture element block).
[0030] The rectangle information storage means 24
stores position information for the rectangle that is
circumscribed to the outline image obtained by the outline
trace means 23, and also stores position information of
a line and paragraph in the rectangle region of the
original image data, and position information of a region
extracted per attribute, such as "cell", "row", "column", and
"frame", in a rectangle region that has been processed
by the projection process.

[0031] The circumscribed rectangle integration 
means 25 integrates rectangle regions that are over- 
lapped or circumscribed to each other based on the po- 
sition information of the rectangles stored in the rectan- 
gle information storage means 24. 
[0032] The judgment means 26 for character and 
ruled-line judges whether each rectangle region, that 
has been integrated, corresponds to each of the at- 
tributes such as "character", "ruled-line", and so on. 
[0033] The projection means 27 takes a projection of
the rectangle region of the original image data of the
attributes other than the attribute "character". The
extracting means 28 for line and paragraph extracts a line
and a paragraph from the rectangle region after the
completion of the projection process. The judgment
means 29 for table, photograph, and frame judges
whether the rectangle region after the completion of the
projection process is one of a table, a photograph, and
a frame.

[0034] The extracting means 30 for cell, row, and
column extracts a cell, a row, and a column from the rectangle
region that has been judged as a table or a photograph.
The attribute region extracting means 31 extracts a
region per attribute of image data from the rectangle
region that has been judged as a frame.
[0035] Contents stored in the rectangle information
storage means 24 are outputted and then stored in the
modification information storage means 3.
[0036] FIG.3 is a block diagram showing a detailed
configuration of the operation means 5 shown in FIG.1.
The operation means 5 comprises: a specifying means
52; a modification region selection means 51; and a
modification content selection means 53. The specifying
means 52 is the means through which the operator selects
a region as a target to be modified among the extracted
regions displayed by the display means 4, and determines
and specifies a modification content for the selected
region. The modification region selection means 51 selects
the modification region by sequentially moving a cursor on
the display means 4 according to the position information
of the extracted regions stored in the modification
information storage means 3 and the designated contents
obtained from the specifying means 52. The modification
content selection means 53 displays, on the display means 4
on which the extracted regions are displayed, the menu of
the modification contents (the kinds of modification
operations) for the region designated for modification.
The modification content selection means 53 further stores
the designated modification content into the modification
information storage means 3.

[0037] Next, a description will be given of the operation
according to the first embodiment.
[0038] When reading a target document to be modified,
the image input means 1 obtains the input image,
for example, as shown in FIG.4. This input image is then
stored temporarily into the image memory 11 (as image
storage means) in the image input means 1.
[0039] This input image can be obtained as follows:
When an operator instructs the apparatus to start a pre-scan
process, light is irradiated onto the target document, a
line sensor such as a CCD receives the light reflected from
the target document, and the CCD then converts the
reflected light to electrical signals (density signals) as the
input image.

[0040] After this process, the automatic region
extracting means 2, the display means 4, and the
modification image making means 6 each read the input image
stored in the image memory 11 in the image input means
1.

[0041] The automatic region extracting means 2 judges
whether parts of the original image data as the input image
belong to rectangle regions corresponding to one of the
attributes such as "character", "photograph", "table",
"ruled line", and "frame", and then extracts the rectangle
regions from the original image data. The automatic
region extracting means 2 then groups the obtained
rectangle regions into character regions per line or
paragraph and table regions per cell, row, column, and table.
The automatic region extracting means 2 then stores
those grouped regions into the modification information
storage means 3.

[0042] The automatic region extracting means 2 performs
an important function, as one of the features of the
present invention, for modifying image regions of
various kinds of attributes.
[0043] Hereinafter, the operation of the automatic
region extracting means 2 will be explained.
[0044] The binarization means 21 reads the original
image data, for example as shown in FIG.5, stored in
the image memory 11 in the image input means 1, and
then converts the input image data to binary data. In
this operation, it is necessary to read the original image
data with a resolution at which the interval between
adjacent lines in the binary data can be recognized. In
this preferred embodiment, the pre-scan is performed
with a resolution of 100 dpi.
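As an illustration only, the binarization step can be sketched as a fixed-threshold conversion. The text does not state how the binarization means 21 chooses its threshold, so the function name `binarize`, the threshold value 128, and the convention that smaller values are darker are all assumptions of this sketch.

```python
def binarize(gray_rows, threshold=128):
    """Convert gray-level rows to binary rows: 1 = black pixel, 0 = white pixel.

    Assumption: values run from 0 (dark) to 255 (light), and the fixed
    threshold of 128 is hypothetical; the source does not specify either.
    """
    return [[1 if value < threshold else 0 for value in row]
            for row in gray_rows]


if __name__ == "__main__":
    page = [
        [255, 250, 30, 20],
        [240, 128, 10, 200],
    ]
    for row in binarize(page):
        print(row)
```

The result uses the same 0/1 row layout that a later run-filling or outline-tracing step could consume directly.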

[0045] The black pixel connection means 22 scans
the binary data in a main scan direction. When a run of
continuous white pixels is not more than four pixels (3pt)
long, the black pixel connection means 22 converts these
continuous white pixels into black pixels in order to obtain
the image where the black pixel blocks are connected, as
shown in FIG.6. It is also possible to perform
the outline trace process, which will be described later,
instead of the above black pixel connection process.
[0046] However, the above black pixel connection
process can eliminate smaller regions that would cause
the circumscribed rectangle integration process to fail.
In addition, the above black pixel connection process can
decrease the total number of black pixel blocks, and this
can reduce the size of the data to be stored into the
rectangle information storage means 24, which will be
explained later.
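The black pixel connection along the main scan direction can be sketched as a run-filling pass over each row. This is a minimal sketch under stated assumptions: only white runs bounded by black pixels on both sides are filled (the text does not say how runs touching the image edge are treated), and the names `connect_row` and `connect_image` are hypothetical.

```python
def connect_row(row, max_gap=4):
    """Fill white runs of at most max_gap pixels lying between two black pixels.

    row is a list of 0 (white) and 1 (black). The limit of four pixels
    follows the text; treating edge runs as unfillable is an assumption.
    """
    out = list(row)
    last_black = None
    for x, value in enumerate(row):
        if value == 1:
            if last_black is not None:
                gap = x - last_black - 1
                if 0 < gap <= max_gap:
                    for i in range(last_black + 1, x):
                        out[i] = 1
            last_black = x
    return out


def connect_image(image, max_gap=4):
    """Apply the row-wise connection to every row of a binary image."""
    return [connect_row(row, max_gap) for row in image]


if __name__ == "__main__":
    print(connect_row([1, 0, 0, 1, 0, 0, 0, 0, 0, 1]))
```

In the usage line, the two-pixel gap is filled while the five-pixel gap is left white, so nearby strokes merge into one block but well-separated blocks stay apart.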
[0047] The outline trace means 23 cuts out circumscribed
rectangles based on the well-known technique in which
the outline of the black pixel block is traced. (For
example, see "Digital Picture Processing", Azriel Rosenfeld,
Avinash C. Kak, Academic Press, 1976.)
[0048] In a concrete example, the image data after the
black pixel connection processing is scanned along the
main scan direction, the coordinate of the first detected
black pixel is used as a trace start coordinate, and the
outline of the black pixel block is traced until the trace
returns to the trace start coordinate in order to obtain
the outline image. Then, the position information for the
rectangle circumscribed to this outline image is stored
into the rectangle information storage means 24. This
preferred embodiment reduces the processing time by
skipping, during the scan process, image parts that lie
within rectangle regions that have already been detected.
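The circumscribed rectangle of each black pixel block can be illustrated with a short sketch. Note that this sketch deliberately swaps in a breadth-first connected-component search instead of the outline trace described above; for 8-connected blocks both yield the same circumscribed rectangle, but the search is simpler to show. The function name `circumscribed_rectangles` and the inclusive `(x0, y0, x1, y1)` tuple layout are assumptions.

```python
from collections import deque


def circumscribed_rectangles(image):
    """Return one inclusive (x0, y0, x1, y1) rectangle per 8-connected black block.

    Substitute technique: connected-component search, not the outline
    trace of the cited reference.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    rects = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            rects.append((x0, y0, x1, y1))
    return rects


if __name__ == "__main__":
    sample = [
        [1, 1, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
    ]
    print(circumscribed_rectangles(sample))
```

Marking visited pixels plays a role comparable to the skipping of already detected regions mentioned in the text: each block is visited only once during the scan.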
[0049] The circumscribed rectangle integration
means 25 integrates the rectangle regions that are
overlapped and also circumscribed to each other based on
the position information stored in the rectangle
information storage means 24. Further, when the integration is
performed, the circumscribed rectangle integration
means 25 updates the position information stored in the
rectangle information storage means 24 with the
position information of the integrated rectangles.
[0050] This integration process obtains the circumscribed
rectangle image, as shown in FIG.7. In this
embodiment, the condition of the circumscribed rectangle
region is that the distance of the part that is most closely
circumscribed between adjacent rectangles is not less
than three pixels (2pt).
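The integration of overlapping or neighboring rectangles can be sketched as a repeated pairwise merge. This is a sketch under stated assumptions, not the embodiment itself: it assumes that two rectangles are integrated when they overlap or when the coordinate distance between their nearest edges is at most a limit, and it borrows the three-pixel figure from the text as that limit. The direction of the comparison and the names `edge_gap` and `integrate` are assumptions.

```python
def edge_gap(a, b):
    """Largest per-axis distance between inclusive rectangles (0 if they overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def integrate(rects, max_gap=3):
    """Merge rectangles until no pair is closer than or equal to max_gap.

    Assumption: max_gap = 3 pixels is used as an upper limit for merging.
    """
    rects = list(rects)
    changed = True
    while changed:
        changed = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if edge_gap(rects[i], rects[j]) <= max_gap:
                    a, b = rects[i], rects[j]
                    rects[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del rects[j]
                    changed = True
                    break
            if changed:
                break
    return rects


if __name__ == "__main__":
    print(integrate([(0, 0, 4, 4), (6, 0, 9, 4), (20, 20, 25, 25)]))
```

Restarting the pair search after every merge keeps the sketch short; it is quadratic per pass, which is acceptable for illustration but would likely be replaced by a sorted sweep for large pages.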

[0051] The regions that have been extracted by the 
above procedure are classified into one of the attributes, 
"character", "photograph", "table", "ruled line", and 
"frame". 

[0052] The operation for the above classification will
be explained with reference to the flowchart shown in
FIG.8.

[0053] The judgment means 26 for character and
ruled line calculates various values of each rectangle
region, such as a height Hs, a width Ws, an aspect ratio
Hs/Ws (ratio of height to width), and Ws/Hs (ratio of
width to height), based on the position information of the
rectangle regions stored in the rectangle information
storage means 24. In addition, the judgment means 26
compares the height Hs, the width Ws, the ratio Hs/Ws,
and the ratio Ws/Hs with first to third threshold values, and
judges whether each rectangle region is classified into
one of the attributes "character", "ruled line", and "others"
based on the conditions and attributes shown in the
table of FIG.9.

[0054] The attribute judgment results (character,
ruled line) obtained are stored in the rectangle
information storage means 24 at Steps S2 to S4. Those
processes are repeated until no unprocessed
rectangle region remains (Step S5).

[0055] The results of a test on a plurality of target
documents indicate that the height Hs of the attribute
"character" is not less than 6pt and less than 48pt, the
height Hs of the attribute "ruled line" is less than 6pt,
and each of the ratios Hs/Ws and Ws/Hs of the attribute
"ruled line" is 16 times or more when compared with
those of the attribute "character".
[0056] In the present embodiment, the first, second,
and third threshold values are set to Th=8
(6pt), Tr=16, and Tc=66 (48pt), respectively.
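How the three threshold values might be applied can be sketched as follows. The table of FIG.9 is not reproduced in the text, so the decision rules below are assumptions inferred from the test results stated above: a region that is thinner than Th in one dimension and elongated by a factor of at least Tr is treated as a ruled line, and a region whose height lies in the range from Th up to but not including Tc is treated as a character. The function name `judge_character_or_ruled_line` is hypothetical. For orientation, 6pt at 100 dpi is about 8.3 pixels and 48pt is about 66.7 pixels.

```python
TH = 8    # first threshold (6pt at 100 dpi), from the text
TR = 16   # second threshold (aspect ratio), from the text
TC = 66   # third threshold (48pt at 100 dpi), from the text


def judge_character_or_ruled_line(ws, hs):
    """Classify a rectangle of width ws and height hs (in pixels).

    Assumed rules; the actual conditions are those of FIG.9.
    """
    if ws <= 0 or hs <= 0:
        raise ValueError("width and height must be positive")
    horizontal_rule = hs < TH and ws / hs >= TR
    vertical_rule = ws < TH and hs / ws >= TR
    if horizontal_rule or vertical_rule:
        return "ruled line"
    if TH <= hs < TC:
        return "character"
    return "others"


if __name__ == "__main__":
    for size in [(400, 3), (300, 12), (300, 200), (3, 400)]:
        print(size, judge_character_or_ruled_line(*size))
```

Regions that fall through to "others" would then be passed on to the projection-based judgment for table, photograph, and frame.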
[0057] Next, a description will be given of the process- 
ing for the rectangle region that has been judged as the 
attribute "others". 

[0058] The projection means 27 obtains projections, in
both the vertical and horizontal directions, of the original
image data corresponding to the rectangle region, which
is stored in the image memory 11 in the image input
means 1, at Step S6. FIGS.10A, 10B,
and 10C show the projection data obtained at Step S6.
[0059] At Step S7, the judgment means 29 for table,
photograph, and frame judges whether the attribute of
the rectangle region is one of a table, a photograph, and
a frame based on the conditions and attributes shown
in FIG.11, according to the number of peaks corresponding
to the attribute "ruled line", that is, peaks whose height
is adequately high or whose width is narrow relative to
the height Hs or the width Ws of the rectangle region.
[0060] For example, in FIG.10A, one peak is detected
from the projection data in the horizontal direction. Thereby,
the judgment means 29 judges that the attribute of the
rectangle region shown in FIG.10A is "photograph".
[0061] In addition, in FIG.10B, four peaks are detected
from the projection data in both the vertical and
horizontal directions. Thereby, the judgment means 29 judges
that the attribute of the rectangle region shown in
FIG.10B is "table".

[0062] Similarly, in FIG.10C, two peaks are detected
from the projection data in both the vertical and horizontal
directions. Thereby, the judgment means 29 judges that
the attribute of the rectangle region shown in FIG.10C
is "frame". The results of the judgment of the attribute
are stored in the rectangle information storage means
24.
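The peak-counting judgment can be sketched as below. This is only a rough sketch: the table of FIG.11 is not reproduced in the text, so the rule used here (two ruled-line peaks in both directions means "frame", more than two in both means "table", anything else means "photograph") is an assumption generalized from the three examples of FIGS.10A to 10C. The peak definition, the parameters `min_ratio` and `max_width`, and all function names are likewise hypothetical.

```python
def projections(region):
    """Return (row sums, column sums) of a binary region (1 = black)."""
    rows = [sum(row) for row in region]
    cols = [sum(col) for col in zip(*region)]
    return rows, cols


def count_peaks(profile, full_extent, min_ratio=0.8, max_width=8):
    """Count narrow runs whose projection value spans most of the region.

    A run counts as a ruled-line peak if every value in it is at least
    min_ratio * full_extent and the run is at most max_width bins wide.
    Both parameters are assumed values.
    """
    peaks = 0
    run = 0
    for value in list(profile) + [0]:   # trailing 0 closes a final run
        if value >= min_ratio * full_extent:
            run += 1
        else:
            if 0 < run <= max_width:
                peaks += 1
            run = 0
    return peaks


def judge_table_photo_frame(region):
    """Assumed rule set standing in for the conditions of FIG.11."""
    height, width = len(region), len(region[0])
    rows, cols = projections(region)
    horizontal_lines = count_peaks(rows, width)
    vertical_lines = count_peaks(cols, height)
    if horizontal_lines == 2 and vertical_lines == 2:
        return "frame"
    if horizontal_lines > 2 and vertical_lines > 2:
        return "table"
    return "photograph"


if __name__ == "__main__":
    frame = [[1 if x in (0, 9) or y in (0, 9) else 0 for x in range(10)]
             for y in range(10)]
    print(judge_table_photo_frame(frame))
```

The same peak positions could also serve as cell boundaries for a region judged as a table, since each peak marks one ruled line.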

[0063] Then, in Step S8, for the rectangle region
whose attribute has been judged as "table", the
positions of cells in the rectangle region are determined
based on the positions of the peaks in the projection
data, and the positions of the cells, rows (cells
connected in the row direction), and columns (cells
connected in the column direction) are stored into the
rectangle information storage means 24.

[0064] In Steps S11 and S12, for the rectangle region
whose attribute has been judged as "frame", a series
of processes, namely the detection of a black pixel, the
trace of an outline, the detection of a circumscribed
rectangle, the integration of closed rectangles, and the
judgment of attribute, is performed recursively. Thereby,
as shown in FIG.12, the image data in the frame is extracted
into parts corresponding to the attributes "character",
"photograph", "table", and "ruled line".
[0065] In Step S13, for the rectangle regions whose
attribute has been judged as "character", rectangle
regions that may be in the same line are extracted based
on the coordinate in the sub-scan direction of each
rectangle region, and the extracted regions are then
grouped when they satisfy the following conditions.
[0066] In general, it is said that a person can read a
document easily when the space between adjacent lines
is 0.5 times the height of a character and the space
between adjacent paragraphs is 3 times the height of
the character. In this embodiment, these values are
used as the conditions for making a group.

(A) Condition to extract "line" (see FIGS.13A to 13C)

[0067] FIG.13A shows the extraction condition that
the interval in the sub-scan direction of a rectangle is
within 0.5 times the height of a character. FIG.13B
shows the extraction condition that the interval in the
main-scan direction of a rectangle is within the interval
of a paragraph and 3 times the height of a character,
and FIG.13C shows the extraction condition that the
group does not overlap a rectangle region other than
a character when the target rectangles are grouped.

[0068] Condition 1: The interval in the sub-scan direction
of a rectangle region is within 0.5 times the height of
a character.

[0069] Condition 2: The interval in the main-scan direction
of a rectangle region is within the interval of adjacent
paragraphs and 3 times the height of a character.
[0070] Condition 3: The group does not overlap a rectangle
region of an attribute other than a character when
grouped.

[0071] The grouping operation is repeated until no
unprocessed rectangle region remains. Thereby, as
shown in FIG.14, rectangle regions that have been
extracted per line are obtained.
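
As a non-limiting sketch, the grouping under Conditions 1 to 3 might be expressed as below. It assumes lateral writing (the sub-scan direction is vertical and the main-scan direction is horizontal), rectangles given as `(left, top, right, bottom)`, and the height of the first rectangle of a group used as the character height; these choices and all function names are assumptions made for illustration only.

```python
def union(a, b):
    """Smallest rectangle enclosing rectangles a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def overlaps(a, b):
    """True when rectangles a and b share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def can_join(group, rect, char_height, others):
    # Condition 1: offset in the sub-scan (vertical) direction
    # is within 0.5 times the character height.
    if abs(rect[1] - group[1]) > 0.5 * char_height:
        return False
    # Condition 2: gap in the main-scan (horizontal) direction
    # is within 3 times the character height.
    gap = max(rect[0] - group[2], group[0] - rect[2], 0)
    if gap > 3 * char_height:
        return False
    # Condition 3: the merged group must not overlap a region
    # of an attribute other than "character".
    merged = union(group, rect)
    return not any(overlaps(merged, o) for o in others)


def group_lines(char_rects, others):
    """Repeat grouping until no unprocessed rectangle remains."""
    pending = sorted(char_rects, key=lambda r: (r[1], r[0]))
    lines = []
    while pending:
        group = pending.pop(0)
        char_height = group[3] - group[1]  # assumption: first rectangle's height
        changed = True
        while changed:
            changed = False
            for rect in list(pending):
                if can_join(group, rect, char_height, others):
                    group = union(group, rect)
                    pending.remove(rect)
                    changed = True
        lines.append(group)
    return lines
```

In this sketch, a photograph region lying between two character rectangles prevents them from being merged into one line, which corresponds to Condition 3.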

(B) Conditions to extract "paragraph" (see FIGS.15A and
15B, and FIGS.16A to 16C)

[0072] FIG.15A shows the extraction condition that
there is an overlap in the main-scan direction. FIG.15B
shows the extraction condition that the interval in the
sub-scan direction of a rectangle region is within 1.5 times
the height of a character. FIG.16A shows the extraction
condition that the difference between the heights of
rectangles is within 3pt. FIG.16B shows the extraction
condition when there is an indentation. FIG.16C shows the
extraction condition to avoid an overlap with another
rectangle other than a line that has been grouped.
[0073] Condition 1: There is an overlap in the main-scan
direction.

[0074] Condition 2: The interval of lines in the sub-scan
direction is within 1.5 times the height of a character.
[0075] Condition 3: The difference between the heights of
lines is within 3pt.

[0076] Condition 4: There is no indentation.
[0077] Condition 5: There is no overlap with a region
other than a line when grouped.
[0078] The grouping process is repeated until no
unprocessed rectangle region remains. Thereby,
paragraphs can be extracted.
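
Purely for illustration, Conditions 1 to 5 may be combined into a single test for two vertically adjacent lines. The sketch assumes coordinates in points, rectangles given as `(left, top, right, bottom)`, and an indentation detected when the lower line starts further right than the upper line by more than a tolerance; the tolerance `indent_tol_pt` and the function names are hypothetical.

```python
def _union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def _overlaps(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def same_paragraph(upper, lower, char_height_pt, others, indent_tol_pt=1.0):
    """Return True when two line rectangles may be grouped into one paragraph."""
    # Condition 1: overlap in the main-scan (horizontal) direction.
    if not (upper[0] < lower[2] and lower[0] < upper[2]):
        return False
    # Condition 2: line interval within 1.5 times the character height.
    if lower[1] - upper[3] > 1.5 * char_height_pt:
        return False
    # Condition 3: difference between line heights within 3pt.
    if abs((upper[3] - upper[1]) - (lower[3] - lower[1])) > 3:
        return False
    # Condition 4: no indentation (assumed: lower line not shifted right).
    if lower[0] - upper[0] > indent_tol_pt:
        return False
    # Condition 5: no overlap with a region other than a line.
    merged = _union(upper, lower)
    return not any(_overlaps(merged, o) for o in others)
```

How an indentation is actually detected is not stated in the description, so Condition 4 here is only one possible reading.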

[0079] By performing the processes described above,
the original image data are classified into rectangle
regions corresponding to attributes such as "character",
"photograph", "ruled line", and "frame". Further, the
grouping per line or paragraph is performed for the
rectangle region of the attribute "character", and the
grouping per cell, row, column, and entire table is
performed for the rectangle region of the attribute "table".

[0080] The region extract information of the original
image data obtained by the processing performed by the
automatic region extracting means 2, together with the
detailed attribute information and other information of
the extracted regions, is stored in the modification
information storage means 3.
[0081] After the region extracting process and the
attribute judgment process are completed, the compressed
image of the original image, the cursor for selecting the
rectangle regions, and the rectangle region that is
currently selected are displayed on an LCD (Liquid
Crystal Display) panel as the display means 4, as shown in
FIG.18.

[0082] On the display panel as the display means 4
shown in FIG.18, the image that has been pre-scanned
is displayed in the display area 61 for the document so
that the image matches the width of the display device
(the LCD panel), and the operation content, the state of
the modification, and the like are displayed in the
message area 62. In addition, on the display panel as the
display means 4 shown in FIG.18, the reference number 63
designates a cursor key used for selecting a region and
a content of the modification, and the reference number
64 denotes determination keys used for determining the
region and the content of the modification, and also for
canceling the determined content. The reference symbols
F1 to F4 indicate function keys for selecting functions
according to the current situation. The keys 63, 64, F1,
F2, F3, and F4 described above constitute the specifying
means 52 in the operation means 5.

[0083] In the first embodiment, an LCD panel of a low
resolution (320x240 dots) is used as the display means
4.
[0084] When a document is displayed without reduction
on the display means 4 of a low resolution, only a part
of the document is displayed. The operator must then
scroll in all directions (right, left, up, and down) in
order to view the entire document. This decreases the
ease of operation because the operator can hardly
recognize which part of the document is displayed on the
display means 4. On the contrary, when the document is
reduced so that it is displayed in its entirety, scrolling
is not necessary, but the operator can hardly view the
reduced document clearly.
[0085] Accordingly, in the present embodiment, the
resolution of the original image data is converted so that
the width of the original image data fits the width of
the LCD panel (as the display means 4), and the operator
then performs scrolling only in the up and down
directions. This increases the ease of operation compared
with the case where the document is displayed without
reduction, and also compared with the case where the
document is reduced so that the entire document is
displayed.
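
The width-fitting conversion can be illustrated by the following minimal arithmetic sketch. The panel size of 320x240 dots is taken from the embodiment; the image dimensions passed in, and the name `fit_to_width`, are assumptions for illustration.

```python
import math


def fit_to_width(image_w, image_h, panel_w=320, panel_h=240):
    """Scale an image so its width equals the panel width.

    Returns (scale, reduced_height, screens), where screens is the
    number of panel heights needed to view the whole image by
    scrolling only up and down.
    """
    scale = panel_w / image_w
    reduced_h = math.ceil(image_h * scale)
    screens = math.ceil(reduced_h / panel_h)
    return scale, reduced_h, screens
```

For a hypothetical pre-scanned image of 640x960 pixels, the scale is 0.5, the reduced height is 480 dots, and the document spans two panel heights, so vertical scrolling alone suffices.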

[0086] The operator gives the following instructions in
order to perform various modifications while watching the
display contents on the display means 4 described
above.

[0087] The operation will be explained with reference
to the flowchart shown in FIG.22.
[0088] First, on the display image shown in FIG.18,
the cursor is displayed on the first extracted region (a
rectangle region) in raster order by the modification
region selection means 51.

[0089] (1) At Step S21, the operator moves the cursor
to the rectangle region to be modified by operating the
cursor key 63 as the specifying means 52. FIG.19 shows
this state, in which the rectangle region specified by the
cursor is reversed and blinking. When the operator wants
to modify the rectangle region specified by the cursor,
the operator presses the decision key 64b at Step S22 in
order to select the rectangle region to be modified. When
the operator presses the cancel key 64a, the selection of
the rectangle region is cancelled.

[0090] The operator repeats the above operations by
using the keys 63, 64a, and 64b for all rectangle regions
to be modified, until the specification of the rectangle
regions is completed at Step S23.

[0091] By those operations, the target rectangle
regions for the modification are determined and displayed
in reverse. The rectangle region to which the cursor is
moved is determined by the modification region selection
means 51 based on the position information of the
rectangle region that is currently referenced and the
direction of the cursor key 63 that is pushed. Further, by
using a switch key (not shown) in the specifying means 52,
the operator can switch the selection unit: line or
paragraph in the character region, and cell, row, column,
or the entire table in the table region.
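
The description does not state how the next rectangle region is chosen from the current position and the pushed direction. One possible rule, given here only as an assumed sketch, is to pick the nearest region whose center lies in the pushed direction; `next_region` and the direction labels are hypothetical names.

```python
def _center(rect):
    left, top, right, bottom = rect
    return ((left + right) / 2, (top + bottom) / 2)


def next_region(current, regions, direction):
    """Pick the nearest region lying in the pushed direction.

    direction is one of "left", "right", "up", "down".
    Returns the current region when no candidate exists.
    """
    dx, dy = {"left": (-1, 0), "right": (1, 0),
              "up": (0, -1), "down": (0, 1)}[direction]
    cx, cy = _center(current)
    best, best_dist = None, None
    for region in regions:
        if region == current:
            continue
        rx, ry = _center(region)
        # Keep only regions strictly ahead along the pushed direction.
        if (rx - cx) * dx + (ry - cy) * dy <= 0:
            continue
        dist = (rx - cx) ** 2 + (ry - cy) ** 2
        if best is None or dist < best_dist:
            best, best_dist = region, dist
    return best if best is not None else current
```

Other rules, such as moving strictly in raster order, would be equally consistent with the description.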
[0092] (2) When the operator pushes the decision key
64b again or pushes the function key F2 (Modification)
while keeping the cursor on the target rectangle region,
the display is switched to the modification menu, as
shown in FIG.20. The operator operates the cursor key 63
in the modification menu in order to select one of the
contents (kinds) of the modification, and pushes the
decision key 64b in order to specify the desired content
of the modification at Step S24.

[0093] At Step S24, when the content of the modification
for the target rectangle region is determined, the
position information of this target rectangle region is
obtained by the modification region selection means 51,
and the content of the modification is selected by the
modification content selection means 53. This information
and the content are stored into the modification
information storage means 3. At the same time, as shown
in FIG.21, the reduced image of the original image data
and an icon specifying the content of the modification for
the selected target region are displayed on the display
panel of the display means 4.
[0094] In the present embodiment, the selectable
contents of the modification include "hatching", "delete",
"photograph", "reverse", "frame", "underline", "hollow",
and "extract" (delete regions other than the selected
region). The operation of the modification is repeated
until the completion of the modification is judged at
Step S25.

[0095] (3) After the completion of the designation for 
the modification, the operator specifies to start the main 
scan at Step S26. 

[0096] When the operator specifies to start the main
scan, the modification image making means 6 inputs the
original image data of a desired resolution (for example,
400 dpi) from the image input means 1, and reads the
content of the modification, per pixel of the input
original image data, from the modification information
storage means 3. Further, the modification image making
means 6 selects the desired image processing (such as
simple binary processing, photograph processing, reverse
processing, and so on) to be processed by the automatic
region extracting means 2 according to the content of
the modification that has been read. For example, when
the content of the modification is the photograph
processing, the desired processing becomes the photograph
processing; when it is the reverse processing, the
desired processing becomes the reverse processing.

[0097] Furthermore, when the content of the modification
designates the framing or the underline, the mask
pattern corresponding to its content is made. The image
output means 7 prints the obtained image on a print
paper and then outputs the print paper.
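
A minimal sketch of applying a per-pixel modification to a selected rectangle region is given below for illustration. It assumes a binary image held as a list of rows (1 = black), a rectangle `(left, top, right, bottom)` with exclusive right and bottom edges, and a one-pixel frame or underline drawn on the rectangle boundary; these details and the name `apply_modification` are assumptions, not taken from the embodiment.

```python
def apply_modification(image, rect, kind):
    """Return a copy of a binary image with one modification applied.

    image: list of rows of 0/1 pixels (1 = black).
    rect:  (left, top, right, bottom), right/bottom exclusive.
    kind:  "reverse", "delete", "frame", or "underline".
    """
    left, top, right, bottom = rect
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            if kind == "reverse":
                out[y][x] = 1 - image[y][x]
            elif kind == "delete":
                out[y][x] = 0
            elif kind == "frame":
                # Mask pattern: black on the rectangle boundary only.
                on_edge = y in (top, bottom - 1) or x in (left, right - 1)
                out[y][x] = 1 if on_edge else image[y][x]
            elif kind == "underline":
                # Mask pattern: black on the bottom row only.
                out[y][x] = 1 if y == bottom - 1 else image[y][x]
            else:
                raise ValueError(f"unsupported modification: {kind}")
    return out
```

The remaining kinds named in the embodiment ("hatching", "photograph", "hollow", "extract") are omitted from this sketch.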

[0098] As described above, according to the present
embodiment, the operator can select the target rectangle
region for the modification, to be associated with the
contents of the modifications that have been stored
in advance, as the rectangle region corresponding to
one of the attributes "character", "photograph", "table",
"ruled line", and "frame". That is, the documents to be
modified can include documents in which various
attributes such as "character", "photograph", "table",
"ruled line", and "frame" are mixed. It is thereby
possible to perform the modification of the image easily
and efficiently without increasing the workload of the
operator. In addition, it is also possible to increase
the general versatility of the document modification
apparatus and the image processing apparatus equipped
with this document modification apparatus.

Second embodiment 

[0099] FIG.23 is a block diagram showing the image 
processing apparatus equipped with the document 
modification apparatus according to the second embod- 
iment of the present invention. 
[0100] The image processing apparatus comprises:
an image input means 1; an automatic region extracting
means 2; a modification information storage means 3;
a display means 4; a modification image making means
6; an image output means 7; and an automatic
modification means 8.

[0101] The image input means 1 reads a target document
to be processed and inputs it. The automatic region
extracting means 2 extracts a character, a photograph,
a table, a ruled line, a frame, and the like from
the target document that has been read. The modification
information storage means 3 stores extract information
(position information and attribute information)
regarding the rectangle regions that have been extracted.
The automatic modification means 8 automatically modifies
each rectangle region of the input image of the target
document according to the extract information (position
information and attribute information) from the
modification information storage means 3. The
modification image making means 6 makes a modified image
obtained by modifying the input image of the document
according to the modification determined by the
automatic modification means 8. The display means 4
displays the modified image. The image output means 7
prints the modified image on a print sheet and outputs
the print sheet. In the second embodiment, the same
reference numbers are used for the same components as in
the first embodiment.

[0102] Next, a description will be given of the operation
of the image processing apparatus according to the
second embodiment.

[0103] The configuration of the image processing
apparatus according to the second embodiment is basically
the same as that of the first embodiment. The difference
is as follows:

[0104] In the configuration of the first embodiment, an
operator specifies the modification type for each
rectangle region. On the other hand, in the configuration
of the second embodiment, the automatic modification
means 8 can automatically modify the rectangle regions
that have been extracted from the input image.

[0105] Hereinafter, the difference will be explained in
detail.

[0106] The input image that has been read by the image
input means 1 is stored temporarily into the image
memory 11 (as the image storage means) in the image
input means 1. The input image stored in the image
memory 11 is transferred to both the automatic region
extracting means 2 and the modification image making
means 6.

[0107] The automatic region extracting means 2 extracts
rectangle regions corresponding to one of the attributes
such as "character", "photograph", "table", "ruled line",
and "frame" from the input image, and then stores
rectangle information of the rectangle regions that have
been extracted (position information and attribute
information about those rectangle regions) into the
modification information storage means 3.

[0108] The automatic modification means 8 comprises,
for example, a memory (not shown) which stores, as the
automatic modification information, the table shown in
FIG.24, in which the attributes and the contents of the
modification correspond to the positions of the rectangle
regions that have been extracted. The automatic
modification means 8 determines the contents of the
modification to be applied to the rectangle regions that
have been extracted from the original image. For example,
the automatic modification means 8 determines to perform
the contents of the modification (for example, the
"header" part is reversed and the "table" section is
shaded) based on the attributes and the features stored
in the modification information storage means 3.
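
For illustration only, a table of this kind may be sketched as a lookup keyed by attribute and position. The two entries follow the example given above ("header" reversed, "table" shaded); the key layout, the position label "any", the 10% header band, and the function names are assumptions, since the contents of FIG.24 are not reproduced here.

```python
# Hypothetical automatic-modification table: (attribute, position) -> modification.
AUTO_MODIFICATION_TABLE = {
    ("character", "header"): "reverse",
    ("table", "any"): "shading",
}


def classify_position(rect, page_height, header_ratio=0.1):
    """Label a region "header" when it lies in the top band of the page."""
    _, top, _, bottom = rect
    return "header" if bottom <= page_height * header_ratio else "body"


def decide_modification(attribute, position, table=AUTO_MODIFICATION_TABLE):
    """Look up the modification; fall back to a position-independent entry."""
    exact = table.get((attribute, position))
    if exact is not None:
        return exact
    return table.get((attribute, "any"))
```

A region with no matching entry yields `None` in this sketch, meaning that no modification is applied to it. Because the table is ordinary data, an operator could change its entries without altering the lookup logic.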

[0109] The modification image making means 6 performs
the contents of the modification, for example,
"reversing", "shading", and so on, determined by the
automatic modification means 8, makes the mask pattern
of the input image after the modification, and then
outputs the mask pattern to the display means 4 and the
image output means 7. The operator can recognize the
content of the modification from the modified image
displayed on the display means 4. The image output means
7 prints the modified image on a printing sheet and then
outputs the printing sheet.

[0110] By the way, it is possible for the operator to
change the contents of the table stored in the automatic
modification means 8.

[0111] As described above, according to the second
embodiment, the rectangle regions corresponding to the
attributes such as "character", "photograph", "table",
"frame", "ruled line", and so on are extracted from the
original image, and the extracted rectangle regions are
modified automatically according to the contents of the
modification set in the table in advance.

[0112] Accordingly, the second embodiment can be
efficiently applied to the case where the contents of the
modification are fixed, for example, to a document
such as an advertisement, because it is possible to
automatically perform the modification of the document
without requiring any work by the operator.
[0113] By the way, although the image processing
apparatus includes the display means in the configurations
of both the first and second embodiments, it is possible
to eliminate the display means from the configuration
and still obtain the same effect, because the display
means is not essential.
[0114] Both the first and second embodiments have
been explained by using documents of lateral lines
in lateral writing, but the present invention is not
limited to these cases. It is also possible to apply the
present invention to documents of vertical lines in
vertical writing, such as Japanese documents, by switching
the process of the main-scan direction with the process of
the sub-scan direction.
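
One way to picture the switching of the main-scan and sub-scan directions, offered only as an assumed sketch, is to transpose each rectangle before grouping and to transpose the result back afterwards, so that a grouping routine written for lateral writing can be reused unchanged. The names `transpose_rect` and `group_vertical`, and the callable `group_fn`, are hypothetical.

```python
def transpose_rect(rect):
    """Swap the main-scan and sub-scan coordinates of a rectangle."""
    left, top, right, bottom = rect
    return (top, left, bottom, right)


def group_vertical(rects, group_fn):
    """Group rectangles of a vertically written document.

    group_fn is any grouping routine written for lateral writing
    that takes and returns a list of (left, top, right, bottom).
    """
    swapped = [transpose_rect(r) for r in rects]
    grouped = group_fn(swapped)
    return [transpose_rect(g) for g in grouped]
```

Transposing twice restores the original rectangle, so the output coordinates refer to the original, untransposed image.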

[0115] Furthermore, although the image processing
apparatuses according to both the first and second
embodiments use the LCD panel as the display means, it
is possible to use a CRT display instead of the LCD
panel. Moreover, the first embodiment uses the key input
method as the specifying means 52 in the operation
means 5, but the present invention is not limited to this
case; for example, it is possible to use a mouse, a touch
panel, or another method.

[0116] In addition, the image processing apparatuses
of both the first and second embodiments are capable
of performing the modification of the input image per
line or paragraph when the result of the judgment of the
attribute specifies the attribute "character". However,
the present invention is not limited to this operation;
for example, it is also possible to perform the cutting
process to estimate the interval of adjacent lines based
on the height of a character, and to perform the
modification per line.

[0117] Further, when the image input means in both
the first and second embodiments is capable of directly
inputting binary data during the pre-scan process, it is
possible to eliminate the binarization means 21 from the
configurations of the first and second embodiments, and
when it is capable of directly inputting multi-value
data, it is possible for the automatic region extracting
means 2 to extract regions from the multi-value data and
to perform the modification for the extracted regions.
[0118] As set forth in detail, according to the present
invention, because rectangle regions corresponding to
the various attributes such as "character", "photograph",
"table", "ruled line", and "frame" can be extracted from
the input image, it is possible to apply the present
invention to target documents which involve mixed
attributes such as characters, photographs, tables, ruled
lines, and frames, and also possible to increase the
general versatility of the image processing apparatus.
[0119] Furthermore, according to the present invention,
because the instruction of the modification can be
performed per line, it is possible to reduce the
operator's load and thereby possible to reduce the working
time of the modification for the target document image.
[0120] Moreover, according to the present invention,
because the rectangle regions that have been extracted
from the input image can be displayed in addition to the
input image, it is possible to easily and efficiently
select the target rectangle regions to be modified.
[0121] Furthermore, according to the present invention,
because the rectangle regions that have been extracted
are blinking on the display means, it is possible
for the operator to smoothly select the target rectangle
regions to be modified without missing any of the target
rectangle regions.

[0122] In addition, according to the present invention, 
it is possible to easily specify a modification displayed 
in the menu, to be applied to the selected rectangle re- 
gion. 

[0123] Furthermore, according to the present invention,
the input image is reduced in size, so that it is
possible to display the input image or the rectangle
regions that have been extracted according to the display
size of the LCD panel in the display means, and it is
thereby possible to increase the ease of the operation
in the selection of the target rectangle region and the
modification.

[0124] Moreover, according to the present invention,
because the rectangle regions of the input image to be
modified are selected automatically and the kinds of the
modification are also determined automatically, it is
possible to perform the modification of the image of each
target rectangle region to be modified without receiving
any instruction from the operator.
[0125] Furthermore, according to the present invention,
because the documents including various kinds of
attributes such as "character", "photograph", "table",
"ruled line", "frame", and so on can be modified easily,
it is thereby possible to modify the image of the document
efficiently and also to increase the general versatility
of the image processing apparatus equipped with the
document modification apparatus.

[0126] While the above provides a full and complete
disclosure of the preferred embodiments of the present
invention, various modifications, alternate constructions
and equivalents may be employed without departing
from the scope of the invention. Therefore the above
description and illustration should not be construed as
limiting the scope of the invention, which is defined by
the appended claims.



Claims 

1. A document modification apparatus for modifying
image data read by image input means, comprising:

region extracting means for extracting a plurality
of regions from the image data, each region
being a unit to be modified;
region selection means for selecting target regions
to be modified from the plurality of regions
through an operator;
modification specifying means for specifying
kinds of the modifications for the target regions
selected by the region selection means through
the operator; and

modification image making means for making
a modified image, based on the kinds of the
modifications specified by the modification
specifying means, in the regions in the image data
selected by the region selection means.

2. The document modification means according to
claim 1,

wherein the region extracting means extracts
rectangle regions as the target regions to be modified,
and the region extracting means comprises region
attribute judgment means for judging an attribute
for each rectangle region.

3. The document modification means according to
claim 2,

wherein the region attribute judgment means
judges whether an attribute of each rectangle region
that has been extracted is one of attributes
such as "character", "photograph", "table", "ruled
line", and "frame".

4. The document modification means according to
claim 3,

wherein the region extracting means integrates
the rectangle region, whose attribute has been
judged as "character" by the region attribute
judgment means, per line and paragraph, and
the region selection means selects the target
region to be modified per line and paragraph
through the operator.

5. The document modification means according to
claim 1,

wherein the region extracting means displays
on a display screen the rectangle regions extracted
by the region extracting means with the image data
read by the image input means, and selects whether
each rectangle region on the display screen is
modified or not through the operator.

6. The document modification means according to
claim 1,

wherein the modification instruction means
displays an at-a-glance menu showing the information
regarding the kinds of the modification, and selects
the modification, to be applied to the selected
rectangle regions, from the kinds of the modifications
shown in the at-a-glance menu through the operator.



7. The document modification means according to
claim 1,

wherein the modification image making means
comprises memory means for storing position
information of the rectangle regions selected by
the region selection means and the modification
information regarding the kinds of the modifications
specified by the modification specifying means,

and the modification image making means performs
the modification for the image data read
by the image input means based on the position
information and the modification information stored
in the memory means.

8. The document modification means according to 
claim 1, further comprises resolution conversion 
means for changing a resolution of the input image 
data to a reduced image; and display means for dis- 
playing the reduced image obtained by the resolu- 
tion conversion means with the rectangle regions 
extracted by the region extracting means. 

9. A document modification apparatus for modifying
image data read by image input means, comprising:

region extracting means for extracting a plurality
of regions from the image data, each region
being a unit to be modified;
automatic modification means for automatically
selecting target regions to be modified from the
plurality of regions, and for automatically modifying
the selected target regions based on
modifications that have been set in advance;
and

modification image making means for making
a modified image in the target regions
selected by the automatic modification means
based on the kinds of the modifications determined
by the automatic modification means.

10. The document modification means according to
claim 9,

wherein the automatic modification means
determines the kind of the modification to be applied
to each selected target region in consideration of
the attribute for the selected target region and the
position of the selected target region in the input
image data.

11. The document modification means according to
claim 9,

wherein the region extracting means comprises
region attribute judgment means for judging an
attribute of each region,
and the attribute of each region to be judged by
the region attribute judgment means is one of
attributes "character", "photograph", "table",
"ruled line", and "frame".

12. The document modification means according to
claim 1,

wherein the image input means converts the
input image data to binary image data.

13. The document modification means according to
claim 9,

wherein the image input means converts the
input image data to binary image data.

14. An image processing apparatus comprising:

image input means for reading image data from
a document;

the document modification apparatus, as
claimed in claim 1, for making a modified image
by modifying the input image data obtained by
the image input means; and
image output means for outputting the modified
image obtained by the document modification
apparatus.

15. An image processing apparatus comprising:

image input means for reading image data from
a document;
the document modification apparatus, as
claimed in claim 9, for making a modified image
by modifying the input image data obtained by
the image input means; and
image output means for outputting the modified
image obtained by the document modification
apparatus.